Applied Bayesian Analyses in R

Part2: prior probability distributions

Sven De Maeyer

Intro

When to Worry and How to Avoid Misuse of Bayesian Statistics

by Laurent Smeets and Rens van der Schoot

Before estimating the model:

1. Do you understand the priors?

After estimation before inspecting results:

  1. Does the trace-plot exhibit convergence?
  2. Does convergence remain after doubling the number of iterations?
  3. Does the posterior distribution histogram have enough information?
  4. Do the chains exhibit a strong degree of autocorrelation?
  5. Do the posterior distributions make substantive sense?

Understanding the exact influence of the priors

  1. Do different specification of the multivariate variance priors influence the results?
  2. Is there a notable effect of the prior when compared with non-informative priors?
  3. Are the results stable from a sensitivity analysis?
  4. Is the Bayesian way of interpreting and reporting model results used?

Focus on the priors before estimation

Remember: priors come in many disguises

Uninformative/Weakly informative

When objectivity is crucial and you want let the data speak for itself…

Informative

When including significant information is crucial

  • previously collected data
  • results from former research/analyses
  • data of another source
  • theoretical considerations
  • elicitation

What do we exactly mean by “informativeness”?

  • informativeness refers to how strong the information in the prior determines the posterior

or

  • “The ultimate significance of this information, and hence the prior itself, depends on exactly how that information manifests in the final analysis.” 1

“Beautifull parents have more daughters” 1

If I dichotomize the respondents into those who are rated “very attractive” and everyone else, the difference in the proportion of sons between the two groups (0.52 vs. 0.44) is statistically significant (t = 2.44, p<0.05).

n = 3000

Impact of priors … 1

Parameter: difference in % girls for both groups
Best estimate using a “flat prior”: 8% difference between both groups

Impact of priors … 1

Parameter: difference in % girls for both groups
What we know from extensive research:

  • proportion of girl births is stable around 48.5%
  • only small effects due to selective abortion, infanticide, and extreme poverty and famine
  • effects around 0.5% difference

So, we could argue for a full informative prior like a normal distribution with mean = 0 and sd = 0.001

Impact of priors … 1

Parameter: difference in % girls for both groups

Best estimate using a using a full informative prior: 0.007% difference

Impact of priors … 1

Parameter: difference in % girls for both groups

Best estimate using a using a weakly informative prior: 0.2% difference

Impact of priors … 1

What’s the problem?

  • difference in proportion girls in population is almost certainly less than 1%
  • available data (3000 surveyed respondents) is weak
  • uniform prior is a strong statement: difference can be large!

Prior can only be interpreted in the context of the likelihood

Uninformative priors can become very informative

  • For each parameter in the model we set priors

  • In a complex model there can be a complex interplay between priors

  • Setting weak priors for each single parameter may result in a lot of information…

Informativeness compared to the likelihood…

Informativeness can only be judged in comparison to the likelihood

  • Prior predictive checking (see later) can help to see the informativeness on the scale of the outcome ==> especially helpful for large models

Suggested help source: https://github.com/stan-dev/stan/wiki/Prior-Choice-Recommendations

brms defaults

  • Weakly informative priors

  • If dataset is big, impact of priors is minimal

  • But, always better to know what you are doing!

  • Complex models might run into convergence issues \(\rightarrow\) specifying more informative priors might help!

So, how to deviate from the defaults?

We have to think!!

Remember our model 2 for Marathon Times (see slides Part 1):

\[\begin{aligned} & \text{MarathonTimeM}_i \sim N(\mu,\sigma_e)\\ & \mu = \beta_0 + \beta_1*\text{km4week}_i + \beta_2*\text{sp4week}_i \end{aligned}\]

What are potential priors for each of the parameters?

What do we need to be aware of to start thinking in priors?



Note: I centred both km4week and sp4week around their mean!

Preparations for applying it to Marathon model

Packages needed:

library(here)
library(tidyverse)
library(brms)
library(bayesplot)
library(ggmcmc)
library(patchwork)
library(priorsense)

Preparations for applying it to Marathon model

Load the dataset and the model:

load(
  file = here("Presentations", "MarathonData.RData")
)

MarathonTimes_Mod2 <-
  readRDS(file = 
            here("Presentations",
              "Output",
              "MarathonTimes_Mod2.RDS")
          )

Check priors used by brms

Function: get_prior( )

Remember our model 2 for Marathon Times:

\[\begin{aligned} & \text{MarathonTimeM}_i \sim N(\mu,\sigma_e)\\ & \mu = \beta_0 + \beta_1*\text{km4week}_i + \beta_2*\text{sp4week}_i \end{aligned}\]

get_prior(
  MarathonTimeM ~ 1 + km4week + sp4week, 
  data = MarathonData
)

Check priors used by brms

  • prior: type of prior distribution
  • class: parameter class (with b being population-effects)
  • coef: name of the coefficient within parameter class
  • group: grouping factor for group-level parameters (when using mixed effects models)
  • resp : name of the response variable when using multivariate models
  • lb & ub: lower and upper bound for parameter restriction

Visualizing priors

The best way to make sense of the priors used is visualizing them!

Many options:

Visualizing priors with ggplot2

Generate a plot for the prior of the effects of independent variables

# Setting a plotting theme
theme_set(theme_linedraw() +
            theme(text = element_text(family = "Times", size = 10),
                  panel.grid = element_blank(),
                  plot.title = element_markdown()))

# Say we set a Normal distribution with mean = 0 and sd = 10
Prior_betas <- ggplot( ) +
  stat_function(
    fun = dnorm,    # We use the normal distribution
    args = list(mean = 0, sd = 10), # 
    xlim = c(-20,20)
  ) +
  scale_y_continuous(name = "density") +
  labs(title = "Prior for the effects of independent variables",
       subtitle = "N(0,10)")

Prior_betas 

Visualizing priors with ggplot2

Probability density plots for slopes of both predictors

Visualizing priors with ggplot2

Generate the plot for the prior of the Intercept (mu)

library(metRology)

# Use the brms default t-distribution

Prior_mu <- ggplot( ) +
  stat_function(
    fun = dt.scaled,    # We use the dt.scaled function of metRology
    args = list(df = 3, mean = 199.2, sd = 24.9), # 
    xlim = c(120,300)
  ) +
  scale_y_continuous(name = "density") +
  labs(title = "Prior for the intercept",
       subtitle = "student_t(3,199.2,24.9)")

Prior_mu 

Visualizing priors with ggplot2

Probability density plot for the intercept

Visualizing priors with ggplot2

Generate the plot for the prior of the error variance (sigma)

# Use the brms default t-distribution

Prior_sigma <- ggplot( ) +
  stat_function(
    fun = dt.scaled,    # We use the dt.scaled function of metRology
    args = list(df = 3, mean = 0, sd = 24.9), # 
    xlim = c(0,6)
  ) +
  scale_y_continuous(name = "density") +
  labs(title = "Prior for the residual variance",
       subtitle = "student_t(3,0,24.9)")

Prior_sigma  

Visualizing priors with ggplot2

Probability density plot for the error variance

Understanding priors… Another example

Experimental study (pretest - posttest design) with 3 conditions:

  • control group
  • experimental group 1
  • experimental group 2

Model:

\[\begin{aligned} & Posttest_{i} \sim N(\mu,\sigma_{e_{i}})\\ & \mu = \beta_0 + \beta_1*\text{Pretest}_{i} + \beta_2*\text{Exp_cond1}_{i} + \beta_3*\text{Exp_cond2}_{i} \end{aligned}\]

With: pretest and posttest standardized (mean = 0 ; sd = 1)

Our job: coming up with priors that reflect that we expect both conditions to have a positive effect (hypothesis based on literature) and no indications that one experimental condition will outperform the other.

Understanding priors… Another example

  • Assuming pre- and posttest are standardized
  • Assuming no increase between pre- and posttest in control condition

Understanding priors… Another example

  • Assuming a strong correlation between pre- and posttest

Understanding priors… Another example

  • Assuming a small effect of experimental conditions
  • No difference between both experimental conditions

Remember Cohen’s d: 0.2 = small effect size; 0.5 = medium effect size; 0.8 or higher = large effect size

Setting custom priors in brms


Setting our custom priors can be done with set_prior( ) command


E.g., change the priors for the beta’s (effects of km4week and sp4week):


Custom_priors <- 
  c(
    set_prior(
      "normal(0,10)", 
      class = "b", 
      coef = "km4week"),
    set_prior(
      "normal(0,10)", 
      class = "b", 
      coef = "sp4week")
    )

Prior Predictive Check

Put priors in the context of the likelihood


Did you set sensible priors?


  • Simulate data based on the model and the priors


  • Visualize the simulated data and compare with real data


  • Check if the plot shows impossible simulated datasets

In brms


Step 1: Fit the model with custom priors with option sample_prior="only"


Fit_Model_priors <- 
  brm(
    MarathonTimeM ~ 1 + km4week + sp4week, 
    data = MarathonData,
    prior = Custom_priors,
    backend = "cmdstanr",
    cores = 4,
    sample_prior = "only"
    )

Note

This will not work with the default priors of brms as brmssets flat priors for beta’s. So, you have to set custom priors for the effects of independent variables to be able to perform the prior predictive checks.

In brms


Step 2: visualize the data with the pp_check( ) function


set.seed(1975)

pp_check(
  Fit_Model_priors, 
  ndraws = 300) # number of simulated datasets you wish for

In brms

Check some summary statistics

  • How are summary statistics of simulated datasets (e.g., median, min, max, …) distributed over the datasets?

  • How does that compare to our real data?

  • Use type = "stat" argument within pp_check()

pp_check(Fit_Model_priors, 
         type = "stat", 
         stat = "median")

Check some summary statistics

Prior sensitivity analyses

Why prior sensitivity analyses?

  • Often we rely on ‘arbitrary’ chosen (default) weakly informative priors

  • What is the influence of the prior (and the likelihood) on our results?

  • You could ad hoc set new priors and re-run the analyses and compare (a lot of work, without strict sytematical guidelines)

  • Semi-automated checks can be done with priorsense package

Using the priorsense package

Recently a package dedicated to prior sensitivity analyses is launched

# install.packages("remotes")
remotes::install_github("n-kall/priorsense")

Key-idea: power-scaling (both prior and likelihood)

background reading:

YouTube talk:

Basic table with indices

First check is done by using the powerscale_sensitivity( ) function

  • column prior contains info on sensitivity for prior (should be lower than 0.05)

  • column likelihood contains info on sensitivity for likelihood (that we want to be high, ‘let our data speak’)

  • column diagnosis is a verbalization of potential problem (- if none)

powerscale_sensitivity(MarathonTimes_Mod2)

Basic table with indices

Sensitivity based on cjs_dist:
# A tibble: 4 × 4
  variable       prior likelihood diagnosis
  <chr>          <dbl>      <dbl> <chr>    
1 b_Intercept 0.000858     0.0856 -        
2 b_km4week   0.000515     0.0807 -        
3 b_sp4week   0.000372     0.0837 -        
4 sigma       0.00574      0.152  -        

Visualization of prior sensitivity

powerscale_plot_dens(
  powerscale_sequence(
    MarathonTimes_Mod2
    ),
  variables = c(
      "b_Intercept",
      "b_km4week",
      "b_sp4week"
    )
  )

Visualization of prior sensitivity

Visualization of prior sensitivity

powerscale_plot_quantities(
  powerscale_sequence(
    MarathonTimes_Mod2
    ),
  variables = c(
      "b_km4week"
      )
  )

Visualization of prior sensitivity